Contrasting objective functions for CYK chart decoding
Authors
Abstract
Context-free inference is a standard part of many NLP pipelines. Most approaches use a variant of the CYK dynamic programming algorithm to populate a chart structure with predicted nonterminals over each span. A parse tree can be extracted from this chart in several ways. In this work, we compare two commonly used decoding approaches (Viterbi and max-rule) with a minimum Bayes risk (MBR) method that has not been widely used. We find that the MBR approach is competitive with, and in some cases superior to, the standard decoding methods.
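As a rough illustration of the chart-then-decode pattern described in the abstract, the sketch below fills a CYK chart for a toy PCFG in Chomsky normal form and extracts a tree under the Viterbi objective only. It is not the paper's implementation: the function name `viterbi_cyk`, the dictionary-based grammar encoding, and the toy grammar are all assumptions made for this example. The max-rule and MBR objectives would typically score chart entries with posterior quantities derived from inside and outside probabilities, which are not shown here.

```python
import math
from collections import defaultdict


def viterbi_cyk(words, lexical, binary, root="S"):
    """Return (log_prob, tree) for the best derivation, or None if no parse.

    lexical: {(A, word): prob} for rules A -> word   (hypothetical encoding)
    binary:  {(A, B, C): prob} for rules A -> B C    (hypothetical encoding)
    """
    n = len(words)
    # chart[(i, j)] maps a nonterminal to (best log prob, backpointer).
    chart = defaultdict(dict)

    # Width-1 spans: apply lexical rules.
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                chart[(i, i + 1)][a] = (math.log(p), w)

    # Wider spans: try every split point and every binary rule.
    for width in range(2, n + 1):
        for i in range(0, n - width + 1):
            j = i + width
            cell = chart[(i, j)]
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for (a, b, c), p in binary.items():
                    if b in left and c in right:
                        score = math.log(p) + left[b][0] + right[c][0]
                        # Viterbi objective: keep the max-probability derivation.
                        if a not in cell or score > cell[a][0]:
                            cell[a] = (score, (k, b, c))

    if root not in chart[(0, n)]:
        return None

    def build(i, j, a):
        """Follow backpointers to rebuild the tree as nested tuples."""
        _, back = chart[(i, j)][a]
        if j - i == 1:
            return (a, back)
        k, b, c = back
        return (a, build(i, k, b), build(k, j, c))

    return chart[(0, n)][root][0], build(0, n, root)


# Toy grammar, invented purely for illustration.
lexical = {("NP", "she"): 0.5, ("NP", "fish"): 0.5, ("V", "eats"): 1.0}
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}

log_prob, tree = viterbi_cyk(["she", "eats", "fish"], lexical, binary)
print(tree)
# ('S', ('NP', 'she'), ('VP', ('V', 'eats'), ('NP', 'fish')))
```

Only the comparison inside the innermost loop encodes the decoding objective, so in a sketch like this the chart-filling loops could stay the same while the per-entry score is swapped for a different criterion.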
Similar resources
A CYK+ Variant for SCFG Decoding Without a Dot Chart
While CYK+ and Earley-style variants are popular algorithms for decoding unbinarized SCFGs, in particular for syntax-based Statistical Machine Translation, the algorithms rely on a so-called dot chart, which suffers from high memory consumption. We propose a recursive variant of the CYK+ algorithm that eliminates the dot chart, without incurring an increase in time complexity for SCFG decoding....
Beam-Width Prediction for Efficient Context-Free Parsing
Efficient decoding for syntactic parsing has become a necessary research area as statistical grammars grow in accuracy and size and as more NLP applications leverage syntactic analyses. We review prior methods for pruning and then present a new framework that unifies their strengths into a single approach. Using a log-linear model, we learn the optimal beam-search pruning parameters for each CY...
Speeding Up Full Syntactic Parsing by Leveraging Partial Parsing Decisions
Parsing is a computationally intensive task due to the combinatorial explosion seen in chart parsing algorithms that explore possible parse trees. In this paper, we propose a method to limit the combinatorial explosion by restricting the CYK chart parsing algorithm based on the output of a chunk parser. When tested on the three parsers presented in (Collins, 1999), we observed an approximate th...
An Efficient Two-Pass Approach to Synchronous-CFG Driven Statistical MT
We present an efficient, novel two-pass approach to mitigate the computational impact resulting from online intersection of an n-gram language model (LM) and a probabilistic synchronous context-free grammar (PSCFG) for statistical machine translation. In first-pass CYK-style decoding, we consider first-best chart item approximations, generating a hypergraph of sentence spanning target language ...
An Efficient Shift-Reduce Decoding Algorithm for Phrased-Based Machine Translation
In statistical machine translation, decoding without any reordering constraint is an NP-hard problem. Inversion Transduction Grammars (ITGs) exploit linguistic structure and balance the needed flexibility against complexity constraints well. Currently, translation models with ITG constraints usually employ the cube-time CYK algorithm. In this paper, we present a shift-reduce decoding algor...